Chapter 3

Protease Cleavage Pattern Discovery

A protein functions when it interacts with other molecules or chemicals. Among the many forms of interaction, protease cleavage has been widely researched for several decades. This type of research aims to build a predictive model from collected laboratory data in order to discover novel interactions. Such a model is commonly built on laboratory-verified protease cleavage data, in which the association between protease cleavage structure and protease cleavage function can be examined. A protease cleavage structure commonly means a primary sequence (or a sub-sequence, or peptide) which is believed to contain a specific amino acid composition pattern or trend related to the protease cleavage function. In other words, a protease cleavage pattern must not show a random amino acid composition. Instead, the composition of the amino acids in a data set of protease-cleaved peptides should demonstrate a trend that a specific protease recognises for an interaction. For a protease cleavage pattern discovery model to work efficiently, two types of peptides are collected and pooled together for constructing the model. They are the cleaved peptides and the non-cleaved peptides.
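
The pooling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the peptides are hypothetical toy sequences (not laboratory data), the helper names `pool`, `position_frequencies` and `most_conserved` are invented for this sketch, and a per-position frequency count is just one simple way to look for a composition trend.

```python
from collections import Counter

# Hypothetical toy peptides of equal length; NOT real laboratory data.
cleaved = ["AAFLKR", "GAFLTS", "PVFLQE", "SLFLDK"]
non_cleaved = ["KRTGAP", "LDESQA", "TPAKGV", "QSRLEN"]


def pool(cleaved_peptides, non_cleaved_peptides):
    """Pool both classes into one labelled data set (1 = cleaved, 0 = non-cleaved)."""
    return [(p, 1) for p in cleaved_peptides] + [(p, 0) for p in non_cleaved_peptides]


def position_frequencies(peptides):
    """For each position, map every observed amino acid to its relative frequency."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    n = len(peptides)
    freqs = []
    for i in range(length):
        counts = Counter(p[i] for p in peptides)
        freqs.append({aa: c / n for aa, c in counts.items()})
    return freqs


def most_conserved(peptides):
    """Return (position, residue, frequency) of the most frequent residue overall."""
    best = (0, "", 0.0)
    for i, freqs in enumerate(position_frequencies(peptides)):
        aa, f = max(freqs.items(), key=lambda kv: kv[1])
        if f > best[2]:  # strict comparison keeps the earliest position on ties
            best = (i, aa, f)
    return best


dataset = pool(cleaved, non_cleaved)
print(len(dataset))
print(most_conserved(cleaved))
print(most_conserved(non_cleaved))
```

In this toy data the cleaved set has a residue that is fully conserved at one position, whereas no residue in the non-cleaved set reaches a frequency above 0.25 at any position. Real data sets would be far larger and noisier, so a trend would show up as a skewed rather than a perfectly conserved distribution.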

Non-cleaved peptides, by contrast, must show no trend in amino acid composition at all. Instead, they must show a random distribution of the amino acids. Through a contrastive comparison between a data set with a trend and a data set without any trend, a pattern by which the two types of data can